A genome-wide association study in a large community-based cohort identifies multiple loci associated with susceptibility to bacterial and viral infections

There is limited data on host-specific genetic determinants of susceptibility to bacterial and viral infections. Genome-wide association studies using large population cohorts can be a first step towards identifying patients prone to infectious diseases and targets for new therapies. Genetic variants associated with clinically relevant entities of bacterial and viral infections (e.g., abdominal infections, respiratory infections, and sepsis) in 337,484 participants of the UK Biobank cohort were explored by genome-wide association analyses. Cases (n = 81,179) were identified based on ICD-10 diagnosis codes of hospital inpatient and death registries. Functional annotation was performed using gene expression (eQTL) data. Fifty-seven unique genome-wide significant loci were found, many of which are novel in the context of infectious diseases. Some of the detected genetic variants were previously reported associated with infectious, inflammatory, autoimmune, and malignant diseases or key components of the immune system (e.g., white blood cells, cytokines). Fine mapping of the HLA region revealed significant associations with HLA-DQA1, HLA-DRB1, and HLA-DRB4 locus alleles. PPP1R14A showed strong colocalization with abdominal infections and gene expression in sigmoid and transverse colon, suggesting causality. Shared significant loci across infections and non-infectious phenotypes in the UK Biobank cohort were found, suggesting associations for example between SNPs identified for abdominal infections and CRP, rheumatoid arthritis, and diabetes mellitus. We report multiple loci associated with bacterial and viral infections. A better understanding of the genetic determinants of bacterial and viral infections can be useful to identify patients at risk and in the development of new drugs.

Bacterial and viral infections are common causes of mortality worldwide 1,2 . As effective antimicrobial treatment is increasingly threatened by the spread of resistant pathogens, new strategies and alternative therapies must be explored to reduce the incidence and burden of these infections 3 . The epidemiological situation, exposure and virulence of the invading pathogen are important to determine the risk of acquiring transmittable diseases, such as respiratory tract infections (RTIs) 4 . Many patient-specific factors, such as older age, malignancies, chronic diseases, and immunosuppression, are also known to increase the incidence and severity of viral and bacterial infections 5,6 . Understanding the risk factors for infections is important in clinical practice to guide strategies for prevention and treatment.
Genome-wide association studies (GWAS) in large population cohorts, whereby associations between phenotypes and genetic variants across the whole genome are examined, are a powerful tool to discover genetic determinants of disease and to uncover novel biology 7 . To date, such data are scarce for infections, and consequently, the genetic variants predisposing to these diseases are largely unknown. A previous GWAS based on self-reported
Association analyses. In total, 81,179 (24%) of the participants had at least one diagnosis indicating a bacterial or viral infection. The number of cases (before or after UK Biobank baseline) for each phenotype included in the GWAS is listed in Table 1. Assuming an additive model, we tested the association between the genotype dosages of each marker and the infection phenotype using logistic regression (Firth's penalized logistic regression in case of non-convergence) in PLINK2 12 . A linear regression with age, sex, PC1-40, and genotype batch (three levels including UKBL interim release, UKBB interim release, and UKBB second release) as predictors and all infections as the outcome was performed in all individuals included in the GWAS. From this model, all PCs up to PC23 reached P < 0.001 and were included as covariates in the GWAS. Associations with P values < 5e−8 were considered significant.
We identified regions containing one or more genome-wide significant SNPs by screening a window of 500 kb adjacent to the first genome-wide significant SNP on each chromosome sorted by genomic position. If no additional SNPs were identified, the region was limited to that specific SNP, and screening was continued at the next GWAS-significant SNP. If additional GWAS-significant SNPs were found, the window was expanded by 300 kb from the last SNP and screened for additional GWAS-significant SNPs, until there were no more such SNPs within the next 300 kb. Within each region, the SNP with the lowest P value was assigned as the index SNP. For each region, conditional association analysis was performed adjusting for all index SNPs found on the chromosome. This distance-based pruning followed by conditional analysis was repeated until no SNPs reached P value < 5e−8. Significant, independent loci with MAF ≥ 1% discovered in the GWAS were compared across the infection phenotypes in this study and all ICD categories (e.g., K57, J18) with ≥ 200 cases and other relevant phenotypes (e.g., smoking, body mass index [BMI]) within the UK Biobank cohort. Linear or logistic regression adjusting for the same covariates as in the infection GWAS was applied for 39 SNPs vs. 743 phenotypes, yielding a Bonferroni-corrected significance threshold of P = 1.7e−6.
missing. Distance to the nearest gene was calculated as the distance from the index SNP to the transcript start or end position (whichever was closest). Regional plots of the association test results were generated for significant loci using LocusZoom v1.4 13 with LD data and GWAS catalog annotations. In the interpretation of results, we focused on SNPs with MAF ≥ 1% and GWAS catalog hits with r 2 ≥ 0.30 and distance ≤ 100 kb from the genetic variant identified in our study. Other hits or nearby genes located within 250 kb of the index SNP are sometimes discussed if considered biologically relevant to the infectious phenotype. In these cases, the effect allele frequency (EAF), distance and r 2 for that specific SNP are specified in the text.
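The distance-based region definition described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the `Snp` record, the `define_regions` function and the output keys are hypothetical names, and the sketch assumes that extending the window by 300 kb never shrinks the initial 500 kb window. The subsequent conditional analysis is not covered.

```python
from dataclasses import dataclass
from typing import Dict, List

GWS_P = 5e-8            # genome-wide significance threshold from the text
FIRST_WINDOW = 500_000  # initial screening window after the first hit (bp)
EXTENSION = 300_000     # extension from the last significant SNP (bp)


@dataclass
class Snp:  # hypothetical record type for one association result
    chrom: str
    pos: int
    p: float
    rsid: str


def define_regions(snps: List[Snp]) -> List[Dict]:
    """Group genome-wide significant SNPs into distance-based regions."""
    # Keep significant SNPs only, ordered by chromosome and position.
    sig = sorted((s for s in snps if s.p < GWS_P), key=lambda s: (s.chrom, s.pos))
    regions = []
    i = 0
    while i < len(sig):
        members = [sig[i]]
        limit = sig[i].pos + FIRST_WINDOW
        j = i + 1
        # Absorb further hits on the same chromosome that fall inside the window.
        while j < len(sig) and sig[j].chrom == sig[i].chrom and sig[j].pos <= limit:
            members.append(sig[j])
            # Assumption: the window end only ever moves forward.
            limit = max(limit, sig[j].pos + EXTENSION)
            j += 1
        index = min(members, key=lambda s: s.p)  # lowest P value = index SNP
        regions.append({
            "chrom": sig[i].chrom,
            "start": members[0].pos,
            "end": members[-1].pos,
            "index_snp": index.rsid,
            "n_sig": len(members),
        })
        i = j  # continue screening at the next significant SNP
    return regions
```

A region that contains a single significant SNP has identical `start` and `end`, matching the case where the region is limited to that specific SNP.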
Fine-mapping of the HLA region. Fine mapping of the human leucocyte antigen (HLA) region was performed due to the critical functions of HLA genes in the immune response, the highly polymorphic nature of the region and high LD between alleles at nearby loci. Imputation of all 11 HLA loci was done centrally using HLA*IMP:02 following the same pre-imputation QC as described for the genome-wide imputation 11 . Dosages for all possible alleles at each HLA locus were tested in separate logistic regression models adjusting for the same covariates as described for the GWAS. Non-tested alleles were assigned a dosage of 0. Only alleles with a minor allele count ≥ 20, calculated separately in cases and controls, were tested. After Bonferroni correction, associations with P < 1.6e−5 were considered significant.
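As a quick arithmetic check on the Bonferroni thresholds quoted in the methods, the sketch below reproduces the cross-phenotype threshold for 39 SNPs tested against 743 phenotypes. The family-wise error rate of 0.05 is an assumption (it is not stated explicitly in the text), and the function names are hypothetical. The number of HLA allele tests is not given either, so the second calculation only works backwards from the rounded threshold and should be read as an order-of-magnitude check.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(threshold: float, alpha: float = 0.05) -> int:
    """Number of tests implied by a reported threshold (assumes the same alpha)."""
    return round(alpha / threshold)


# Cross-phenotype comparison: 39 SNPs vs. 743 phenotypes.
cross_phenotype = bonferroni_threshold(39 * 743)
print(f"{cross_phenotype:.1e}")  # 1.7e-06

# HLA fine mapping: back-calculated from the rounded threshold of 1.6e-5.
hla_tests = implied_number_of_tests(1.6e-5)
print(hla_tests)  # 3125
```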
Functional annotation. The Summary data-based Mendelian Randomization (SMR) approach 14 was applied to determine whether associations between SNP and infection phenotype could be explained by known gene expression. SMR analysis was performed by jointly analysing the infection GWAS results and publicly available expression quantitative trait locus (eQTL) summary statistics, thereby assessing the potential functional significance of the identified loci and pointing to a causal gene. SMR 1.02 was used with the default settings, including GWAS results with MAF ≥ 1%. To assess whether the GWAS and eQTL associations with the phenotype were due to a single shared genetic variant rather than multiple variants in LD with independent effects on the phenotype, the heterogeneity in dependent instruments (HEIDI) test was applied 14 . Gene expression data from eQTL studies were obtained from the Genotype Tissue Expression project (GTEx) V7 release 15 , and LD data from 1000G phase 3 (v5) EUR were used. To limit the number of tests, only gene expression in biologically plausible tissues was considered. Gene expression in the spleen and whole blood was considered potentially important for the immune defence and therefore relevant for all phenotypes. In addition, specific tissues were selected for phenotypes where significant SNPs had been found in the GWAS (Table S2).
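To make the SMR step more concrete, the sketch below computes the SMR effect estimate and its approximate test statistic for a single top cis-eQTL from summary statistics, following the formulation in the SMR method paper (reference 14): the effect of expression on the trait is estimated as the ratio of the GWAS effect to the eQTL effect, and the test statistic is approximately chi-square distributed with one degree of freedom. The function name and example inputs are hypothetical, and this is not a substitute for the SMR software; in particular, the HEIDI test is not implemented here.

```python
import math
from typing import Tuple


def smr_test(b_gwas: float, se_gwas: float,
             b_eqtl: float, se_eqtl: float) -> Tuple[float, float, float]:
    """Return (b_xy, T_SMR, p) for one SNP used as the instrument.

    b_gwas, se_gwas: SNP effect on the trait and its standard error
    b_eqtl, se_eqtl: SNP effect on gene expression and its standard error
    """
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    # Estimated effect of gene expression on the trait.
    b_xy = b_gwas / b_eqtl
    # Approximate chi-square (1 df) test statistic.
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    # Survival function of a chi-square with 1 df, via the error function.
    p = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p


# Hypothetical summary statistics, for illustration only.
b_xy, t_smr, p = smr_test(b_gwas=0.05, se_gwas=0.01, b_eqtl=0.5, se_eqtl=0.05)
```

Because the statistic combines both z-scores, a strong eQTL with a weak GWAS signal (or the reverse) yields a modest SMR statistic, which is why SMR P values are typically less extreme than the GWAS P values at the same locus.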
Heritability. Narrow-sense heritability (h 2 ) explained by additive SNP effects was calculated using LDSC v.1.0.0 16 and observed scale heritability estimates are reported. Infection phenotypes with at least 5000 cases (corresponding to an effective sample size of ~ 20,000) were included using a subset of the SNPs with likely high imputation quality passing the quality filters described for the GWAS, MAF ≥ 1%, and inclusion in HapMap3.
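The correspondence between 5000 cases and an effective sample size of about 20,000 can be checked with a commonly used formula for case-control designs, N_eff = 4 / (1/N_cases + 1/N_controls). The text does not state which formula was used, so this is an assumption, as is treating all remaining participants of the 337,484 as controls.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size of a case-control study (assumed formula)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


n_total = 337_484  # participants included in the GWAS
n_cases = 5_000    # minimum number of cases for the heritability analysis
n_eff = effective_sample_size(n_cases, n_total - n_cases)
print(round(n_eff))  # close to 20,000
```

With a very large control group the effective sample size approaches four times the number of cases, so the case count is the limiting factor for power in the rarer phenotypes.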
For comparison, we also estimated the SNP heritability by Haseman-Elston (HE) regression in GCTA 1.92.1 17 . Directly genotyped SNPs with MAF ≥ 1% were used to construct the genetic relationship matrix. Results from the HE regression based on the cross-product with the standard error computed using the Jackknife approach are reported. All methods were carried out in accordance with relevant guidelines and regulations.

Results
Genotype-phenotype associations. In total, 57 unique genome-wide significant loci were found across all phenotypes (Table 2, Fig. S1). Neither the QQ plots (Fig. S2) nor the genomic control lambda metrics (Table S3) (Table S4). HLA-DQ and HLA-DR are major histocompatibility complex (MHC) class II molecules that play a key role in the adaptive immune response, especially against bacterial infections, by presenting pathogen antigens mainly to the CD4+ T helper cells 18 .
Abdominal infections. Twenty-six significant genetic variants were associated with abdominal infections (Table 2, Fig. 1A). The results for this phenotype were largely driven by ICD-10 code K57 (intestinal diverticular disease and diverticulitis). A sensitivity analysis was performed where K57 was removed from the case definition of abdominal infections. With this updated definition of abdominal infections only three loci reached nominal significance (lead variants rs11428277, rs2049865, and rs377411728), while almost all loci reached P value < 5e−8 when tested for an association with K57 alone (data not shown). One locus (lead variant rs377411728) reached P value < 1e−5 for both K57 and the abdominal infection phenotype excluding K57. The strongest hit was an intronic variant of the ARHGAP15 gene (chr2:rs6717024, P = 1.22e−34) (Fig. 2B). In an in vivo sepsis model, lack of ArhGAP15 (Rho GTPase-activating protein 15), which functions as a negative regulator of multiple neutrophil functions, induced cellular elongation but resulted in more efficient neutrophil migration, phagocytosis, and bacterial killing 19 . Based on these data, ARHGAP15 was suggested as a therapeutic target to enhance the antibacterial activity of white blood cells and decrease systemic inflammation in septic patients. CRISPLD2 (lead variant rs4782673, P = 1.37e−10), which is expressed in multiple tissues and leukocytes, has previously been associated with mortality in sepsis 20 . In a small case-control study, CRISPLD2 was reduced in patients with septic shock and showed a negative correlation with the bacterial infection biomarker procalcitonin 21 , presumably through inhibition of the binding of bacterial lipopolysaccharide (LPS) endotoxins to target cells, and consequently reduced induction of TNF-α and IL-6 cytokine production 22 . Variants in the ARHGAP15 locus were associated with diverticular disease in a GWAS using the UK Biobank cohort 23 and in a cohort of Icelandic and Danish cases and controls 24 . In addition to ARHGAP15, most other variants, in or near the genes SLC35F3, CALCB, COLQ, EFEMP1, LYPLAL1-DT, CRISPLD2, TRPS1, S100A10, ANO1, LINC01082, DISP2, CACNB2, BDNF, P2RY14, WDR70, ELN, FAM185A, ENPP2, ENTPD7, ABO (close to SURF6), PPP1R14A (close to SPINT2) and MIR2113, were also located close to SNPs previously associated with diverticular disease 23 . The protein encoded by COLQ (lead variant rs7609897, P = 9.76e−14) influences smooth muscle motility and the neuromuscular junctions between nerve cells and muscle cells, suggesting a biological function in the development of intestinal diverticula. Variants in the COX15 locus (lead intronic variant, chr10:rs11428277; P = 1.69e−11) were previously associated with colorectal cancer 25 and Crohn's disease 26 . The COX15 protein is localized in the inner mitochondrial membrane and has a key function in the electron transport chain. Bacterial invasion in the intestinal mucosa secondary to inflammation or cancer is a plausible biological explanation for the observed associations.
The asterisk * indicates that associations were found for multiple genes belonging to the HLA-DQ group (e.g., HLA-DQA1, HLA-DQB1). Negative log10-transformed P values for each SNP (y axis) are plotted by chromosomal position (x axis). The grey line represents the threshold for genome-wide statistically significant associations (P = 5e−08). Red points represent significant hits, and each significant locus is annotated with the nearest gene.
Moreover, several genetic variants were located near SNPs or genes of potential importance to susceptibility to other types of infections, the host immune defence and other intra-abdominal conditions. A SNP in the EFEMP1 locus (lead variant rs1802575, 3'UTR, P = 1.56e−11) was previously associated with a history of childhood ear infection 8 . Decreased expression of EFEMP1 (epidermal growth factor-containing fibulin-like extracellular matrix protein) in hepatocellular cancer cells is a predictor of tumour spread and metastasis, and consequently worse prognosis 27 . Interestingly, EFEMP1 acts by promoting SEMA3B, which belongs to the semaphorin family of proteins that regulate multiple physiologic processes including the immune response and cell migration. Reduced levels of SEMA3B in fibroblast-like synoviocytes were found in patients with rheumatoid arthritis, suggesting a role also in the development of autoinflammatory disease 28 . Variants in the SLC35F3 locus (lead variant, chr1:rs4333882; P = 2.14e−14) have been reported to be associated with levels of the pro-inflammatory cytokine IL-6 29 . The biological function of SLC35F3 is unknown, but IL-6 has a key role in the acute phase response to infections by stimulating the production of neutrophils. SNPs in or near TRPS1 (lead variant, rs2049865, P = 4.67e−10) were previously associated with white blood cells and cytokines 30 , and MIR2113 (distance 128 kb from rs9372625, P = 2.02e−9) with the composition of the gut microbiota 31 .
Respiratory tract infections. Seven independent loci were associated with RTI phenotypes (all RTIs, n = 2; bacterial pneumonia, n = 4; influenza and viral pneumonia, n = 1) (Table 2, Fig. 1B). The strongest hit associated with the combined phenotype of all RTIs (chr6:rs28752520, P = 1.77e−10) (Fig. 2A), located in the HLA-DQA1 locus, was previously found to be associated with blood protein levels 32 . Other variants close to our index SNP, but not strongly correlated, have also shown associations with common infections: plantar warts (distance = 0 kb, r 2 = 0.47 according to the GWAS catalog database), childhood ear infection (18 kb, r 2 = 0.021) and scarlet fever (43 kb, r 2 = 0.026) 8 . A significant variant on chromosome 9 (position 128,648,077, P = 1.94e−10), previously reported to be associated with sleep duration 33 , is located near PBX3; variants in this locus have shown association with squamous cell lung carcinoma 34 . The gene product, pre-B-cell leukaemia transcription factor 3, induced an inflammatory response in sepsis in a murine infection model by acting as a competing endogenous RNA for HMGB1 (high-mobility group protein 1) 35 . HMGB1 is produced by macrophages in response to bacterial infections, functioning as an endotoxin-induced cytokine mediator of inflammation, and has been proposed as a potential therapeutic target for sepsis 36 .
Previous studies of schizophrenia 37 , cigarette smoking, chronic pulmonary disease 38 and lung cancer 39 reported associations with SNPs that were adjacent (< 10 kb), but not strongly correlated (r 2 ≤ 0.246), to one of the SNPs associated with bacterial pneumonia (lead variant, rs77438700, P = 7.72e−10). Nearby genes of interest include CHRNA3 and CHRNA5; variants in this locus are associated with chronic obstructive pulmonary disease and lung cancer 40 . CHRNA3 and CHRNA5 encode the alpha-type subunit of a cholinergic receptor, which likely mediates the effects of nicotine on the brain. Due to their association with nicotine dependence, the causal variants at this locus probably serve as a determinant of smoking behaviour, subsequently increasing the risks of chronic lung disease and bacterial pneumonia.
Sepsis. The GWAS revealed only four rare variants associated with sepsis ( Table 2) and no significant correlations were found in the GWAS catalog database. Our findings should be interpreted with caution due to the low frequency and limited sample size of this phenotype (n = 4840). Nearby genes of interest include SELE and SELL, which encode the leukocyte cell adhesion receptors Selectin E and L that are involved in leukocyte/endothelium interactions during interleukin-induced inflammation. SELE is associated with Leukocyte Adhesion Deficiency (LAD), a rare autosomal recessive disorder typically presenting with recurrent severe bacterial infections 41 . Selectin L facilitates entry of lymphocytes into the extracellular space 42 , which is an integral process in the immune response to sepsis. Another associated rare variant (EAF cases = 0.0027, lead SNP chr4:rs564716204, P = 1.90e−08) was located nearby LRBA. LRBA (LPS responsive beige-like anchor protein) deficiency is an autosomal recessive genetic disorder caused by mutations resulting in reduced expression and function of the cytotoxic T lymphocyte-associated protein 4 (CTLA4) 43 . This condition is associated with low levels of immunoglobulins (IgG, IgM, IgA), repeated infections due to impaired humoral immune response, and increased risk of autoinflammatory diseases (e.g., diabetes mellitus, inflammatory bowel disease).
Other phenotypes. Multiple loci, most of which are novel in the context of infectious diseases, were found for the remaining phenotypes: gastroenteritis (n = 6), heart infections (n = 4), sexually transmitted diseases (n = 1), skin infections (n = 1), specified viral infections (n = 1), UTI (n = 5) and urogenital (non-UTI) infections (n = 2) (Table 2). A genetic variant associated with skin infections in our study (lead SNP chr5:rs6595799, P = 2.39e−08) is highly correlated (r 2 ≥ 0.9) with, and close to, SNPs near LINC01184. LINC01184 is a long intergenic non-protein coding RNA that is differentially expressed in many types of cancers and has previously been reported to be associated with cancer 44,45 , blood cell traits 46 and other phenotypes 47 . One of the strongest hits for heart infections in our study (chr14:rs182592259, P = 8.73e−09) was located near BDKRB2, which encodes the bradykinin B2 receptor that has a protective role in the development of hypertension and cardiovascular disease 48 , thereby potentially also affecting the vulnerability to infections.
Scientific Reports | (2022) 12:2582 | https://doi.org/10.1038/s41598-022-05838-z

Functional annotation.
We identified a total of 91 colocalization events representing 4, 15 and 23 unique traits, tissues and genes respectively. PPP1R14A showed the strongest colocalization with abdominal infections, and colocalized with eQTL signals in both sigmoid and transverse colon tissue with lead variants in strong LD (r 2 > 0.98; strongest association lead SNP rs4803934, SMR, P = 4.68e−10) (Table S5). Neighboring genes did not show a similar pattern of colocalization with the GWAS signal in this locus (Fig. 3). PPP1R14A, also known as CPI-17, belongs to the protein phosphatase 1 (PP1) inhibitor family which has a key role in the adjustment of smooth muscle contraction in response to physiological stimuli 49 . PPP1R14A has shown associations with different cancer types in prior large-scale GWASs, including colon and prostate cancer 50,51 . Evidence from these studies points to a transcriptionally mediated effect; imputed PPP1R14A expression, derived as a linear combination of cis genotypes associated with expression of the gene, showed association with prostate cancer in two independent cohorts 51 . This locus also shows association with diverticular disease 23 . We observed that several colocalization events for abdominal disease occurred with genes in the HLA region; SMR and HEIDI analyses identified HLA-DQA2, HLA-DRB6 and HLA-DRB1. Due to the complex LD structure in the HLA region, there is likely to be additional pleiotropy occurring with these discoveries. Although we performed HLA region fine-mapping to detect more closely resolved association signals, we did not perform SMR and HEIDI analyses with the fine-mapped data. Colocalizations for HLA regions were observed in 11 distinct tissues including whole blood and spleen; many tissues likely share eQTLs that underlie these results. We also observed a colocalization between ABO expression and abdominal infections in the adipose visceral omentum (lead SNP rs505922; P = 6.91e−06). 
Associations between blood types and infections were amongst the earlier identified associations between molecular traits and phenotypes 52 . There is evidence that individuals with different blood types have varying levels of susceptibility to acute pyelonephritis 53 , presumably mediated by the expression of receptors in the endothelium, and abdominal infections 54 . Finally, we observed colocalizations between abdominal infections and the colon expression of NOV (also called CCN3, lead variant rs61100635, P = 1.75e−06) and DISP2 (rs2289328; P = 1.27e−06).
Heritability. The narrow-sense heritability on the observed scale was low (0-4%) for all phenotypes (Table S6) with the highest heritability found for abdominal infections. The difference in heritability could partly be related to the phenotype definitions; the phenotype of abdominal infections was more homogenous than RTIs, which included multiple infections of varying severity and pathogens. Moreover, the genetic component is likely higher in endogenous infections, such as abdominal infections, which are normally caused by bacteria of the host's microbiome, compared to exogenous infections, including viral RTIs or gastroenteritis, which depend on exposure and acquisition of a transmittable pathogen.
Phenome-wide associations in the UK Biobank. The analysis of shared significant loci across infectious and non-infectious phenotypes in the UK Biobank cohort revealed associations between one of the SNPs identified for abdominal infections (rs570640158) located in the HLA region, and phenotypes related to infection, inflammatory and autoimmune diseases, including CRP, asthma (ICD-10 code J45), diabetes mellitus (E10, E11) and rheumatoid arthritis (M05, M06) (Fig. S3B). There were multiple shared SNP associations between abdominal infections and diverticular disease (ICD-10 code K57), as discussed above, and rs77438700 was associated both with bacterial pneumonia (ICD-10 code J18) and smoking.

Discussion
In this study, we explored genetic determinants of the susceptibility to phenotypes representing 18 bacterial and viral infection entities and identified 57 unique loci associated with at least one of the phenotypes. While many of the detected significant variants are novel in the context of infectious diseases, the same or strongly correlated SNPs, and nearby genes of potential relevance in the pathophysiology of infections, were frequently found in previous literature. Most SNPs detected for abdominal infections were located close to loci reported in a GWAS by Maguire et al. 23 to be associated with diverticular disease and diverticulitis (ICD-10 code K57), which was also the main driver of results for this phenotype in our study.
As expected, some of the identified loci are associated with infectious diseases or components of the host immune defence against bacterial and viral infections, such as the HLA region. Our findings align with a previous GWAS in which genetic variants in the HLA region were associated with several self-reported infections (e.g., mononucleosis, mumps, pneumonia, and tuberculosis) 8 . Bacterial infections are typically associated with MHC-II genes, and viral infections with the MHC-I region, which is important for peptide recognition in CD8+ cytotoxic T cells. The HLA region is also associated with multiple immunological traits including selective IgA deficiency, the most common primary immunodeficiency in Europeans 55 , and autoimmune diseases such as rheumatoid arthritis 56 , systemic lupus erythematosus 57 and ulcerative colitis 58 . Interestingly, one study showed that one of the genes and its products, HLA-DQA2, is often transferred from cancerous cells to normal cells via extracellular vesicles in malignant colon cancer 59 .
Genetic variants in the TRPS1 (rs2049865, P = 4.67e−10) and LINC01184 (rs6595799, P = 2.39e−08) loci, associated with abdominal infections and skin infections, respectively, showed associations with neutrophil and lymphocyte counts in a cohort of ~ 175,000 European-ancestry participants 30 . White blood cells are key components in the innate and adaptive immune responses to bacterial and viral infections 60 . Abdominal infections and gastroenteritis were associated with variants located in the SLC35F3 (lead SNP rs4333882, P = 2.14e−14) and FSTL5 loci (rare variant, EAF cases = 0.0014, lead SNP rs115809651, P = 8.07e−10), respectively. Although the biological functions of these genes are unknown, their associations with blood levels of cytokines (chemokines, interleukins, interferons) 29 suggest potential importance for the innate immune response. 
Cytokines are key components in the biochemical pathways affecting migration and activation of white blood cells 60 and are also fundamental in the biological processes of autoinflammatory diseases such as rheumatoid arthritis 61 and inflammatory bowel diseases 62 .
Biologically plausible correlations were found between some of the infection phenotypes and chronic diseases, most frequently autoimmune diseases and cancer. While such co-morbidities increase the susceptibility for secondary infections, common genetic determinants that increase the risk for infections, inflammatory disease and malignancies could exist and be revealed either through studies of local genetic correlation or colocalization between traits. In this study, we observed colocalization of a variant associated with abdominal infections and gene expression in colon, suggesting causality of PPP1R14A in this class of infections.
This study has several strengths and limitations. To our knowledge, this is the largest interpreted GWAS to date on bacterial and viral infections using carefully determined compound phenotypes for important infection categories. External validation would have greatly added to the results but was not possible as other comparative data were unavailable. Replication using smaller biobanks with electronic health data would also be valuable to validate our findings. The definition of phenotypes based on specific diagnosis codes is a strength of this study, which is likely to increase sensitivity and specificity in relation to previous studies using self-reported history of diseases or ICD-10 codes without any curation. Still, some misclassifications are expected where the diagnosis set by the treating physician did not accurately describe the clinical syndrome; this situation may have resulted in false positive or negative cases and decreased the power of our analyses. It should be noted that there was sometimes an overlap in ICD-10 codes between phenotypes. As expected, there was some discrepancy in results between the combined phenotypes and subgroups (such as all RTIs vs. bacterial pneumonia). While the larger phenotypes are helpful to capture genetic variants related to the general systemic or local host immune defence, more specific phenotypes and larger cohorts may be required to find, for example, genetic determinants of pathogen-specific endothelial adhesion molecules. The conservative approach of refining the study cohort to correct for population structure and cryptic relatedness may have resulted in a lower estimated heritability. Further study is required to determine whether our observations result from genetic determinants affecting the risk for several disease groups or causal effects of co-morbidities that increase the vulnerability to infections.

Conclusions
In conclusion, we report multiple novel loci associated with bacterial and viral infections in a large population cohort and provide interpretation of these results in the context of previous literature. Our results add significantly to the limited existing data and biological insights in this field. The genetic determinants of infectious disease susceptibility identified in this study could potentially be used to help identify target genes for the development of novel therapeutics for prevention or treatment of these diseases.